-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add manual iterator implementations for custom vec iterables #1107
Conversation
This adds manual `Iterator` implementors for the custom vec `Iterable` pyclasses, which were previously using the Python default iterator implementation that attempts calls to `__getitem__` with successive integers until an `IndexError` is raised. Doing this means that we no longer have any overhead from converting Python-space integer indices to Rust-space ones, because we just store a pointer to the backing vec internally and yield from that using an index we track ourselves.
Pull Request Test Coverage Report for Build 7995769315Details
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me, it's basically following what we did for mapping structs already with keys, values, and items but does it for the vec structs too. The performance improvements are really nice here too.
My only 2 thoughts are do you want to add a test using iter()
explicitly in https://github.com/Qiskit/rustworkx/blob/main/tests/test_custom_return_types.py and also maybe a release not documenting the new feature?
Sure, I can add those. Flicking quickly through the tests now, I suppose there's a question of whether we want to implement more efficient I can throw in a reversed iterator for each of these if you like as well, but it equally might not be worth whatever increase in compilation times it comes with? |
Actually: what kind of |
Release node added in 8f9a991. |
This matches the Python behaviour for the implicit sequence-protocol iterators; they have only `__length_hint__` and not `__len__`.
I was just thinking of something that explicitly exercises it, mostly to say we did. I wasn't expecting a type check, I think the existing tests we have for this on mappings do something like: test = iter(object)
result = list(test)
self.assertEqual(result, expected) It might just be duplicate coverage as that implicitly happens in other tests, but I didn't see the harm in at least validating it explicitly |
That seems worth it to me, personally I'm not too worried about rustworkx compile times. It might slow down development and CI a little bit, but we're not really bottlenecked on that (if it came down to it we could leverage debug mode a lot more and tweak things to optimize more for build speed). But, making sure the library is higher performance for python users is in general a high priority, so having faster |
Ok, I've added the reversed iterators in 5f60d9e and explicit tests of both The timings of the reversed iterators are the same as the timings as the forwards iterators: this PR took the million-node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for doing this.
.borrow(py) | ||
.$data | ||
.len() | ||
.saturating_sub(self.index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's a good trick to comply with __length_hint__
not returning a negative. Although as a usize I guess without this in release mode we would just overflow which wouldn't result in a ValueError
although the hint would be wildly inaccurate (although PEP424 says the return is "not required to be accurate") for an empty or consumed iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wrote this initially for __len__
(which is required to be accurate) and reused it for __length_hint__
, but then I remembered / realised that iterators don't typically implement __len__
so just renamed the method.
This adds manual
Iterator
implementors for the custom vecIterable
pyclasses, which were previously using the Python default iterator implementation that attempts calls to__getitem__
with successive integers until anIndexError
is raised.Doing this means that we no longer have any overhead from converting Python-space integer indices to Rust-space ones, because we just store a pointer to the backing vec internally and yield from that using an index we track ourselves.
Using the same setup code as discussed in #1096, with a manual iteration through afterwards:
On
main
(which includes #1096), thetimeit
measured the iteration at ~78ms, and with this PR it dropped to ~48ms on my machine.I didn't add all the new iterator structs to the typing information concretely, instead just referencing them opaquely as
Iterator[T]
, because these objects only implement magic protocol methods and are not intended to be constructable / picklable / nameable from Python space at all (similar to howlist_iterator
isn't as a builtin). Open to changing that if preferred.I'll mark this as:
custom_vec_iter_impl!
NodeIndices
#1090but feel free to edit this to remove that link if you think there's more places to look for performance here.